Results 1 - 13 of 13
1.
4th International Conference on Machine Learning, Image Processing, Network Security and Data Sciences, MIND 2022 ; 1762 CCIS:203-219, 2022.
Article in English | Scopus | ID: covidwho-2273563

ABSTRACT

Intricate text mining techniques encompass various practices like classification of text, summarization and detection of topic, extraction of concept, search and retrieval of ideal content, document clustering along with many more aspects like sentiment extraction, text conversion, natural language processing etc. These practices in turn can be used to discover some non-trivial knowledge from a pool of text-based documents. Arguments, difference in opinions and confrontations in the form of words and phrases signify the knowledge regarding an ongoing situation. Extracting sentiment from text that is gathered from online networking web-based platforms entitles the task of text mining in the field of natural language processing. This paper presents a set of steps to optimize the text mining techniques in an attempt to simplify and recognize the aspect-based sentiments behind the content obtained from social media comments. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

2.
1st Workshop on NLP for COVID-19 at the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020 ; 2020.
Article in English | Scopus | ID: covidwho-2256286

ABSTRACT

In this paper, we present an information retrieval system on a corpus of scientific articles related to COVID-19. We build a similarity network on the articles where similarity is determined via shared citations and biological domain-specific sentence embeddings. Ego-splitting community detection on the article network is employed to cluster the articles and then the queries are matched with the clusters. Extractive summarization using BERT and PageRank methods is used to provide responses to the query. We also provide a Question-Answer bot on a small set of intents to demonstrate the efficacy of our model for an information extraction module. © ACL 2020. All rights reserved.
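The PageRank step of extractive summarization mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the abstract names BERT for sentence representation, whereas this sketch swaps in plain word-overlap (Jaccard) similarity so that it runs with the standard library only. The function names `jaccard`, `pagerank` and `summarize`, the damping factor 0.85 and the iteration count are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences (stand-in for BERT embeddings)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def pagerank(weights, damping=0.85, iterations=50):
    """Power-iteration PageRank over a weighted adjacency matrix (list of lists).

    Nodes with no outgoing weight pass on no score; this simplification is
    acceptable for a sketch but loses some probability mass.
    """
    n = len(weights)
    scores = [1.0 / n] * n
    out_sums = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            incoming = sum(
                scores[i] * weights[i][j] / out_sums[i]
                for i in range(n)
                if out_sums[i] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences, kept in their original order."""
    n = len(sentences)
    weights = [
        [jaccard(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    top = sorted(range(n), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(top)]
```

Sentences that share vocabulary with many other sentences receive higher scores, so an off-topic sentence with no overlap ends up ranked last.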

3.
1st Workshop on NLP for COVID-19 at the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020 ; 2020.
Article in English | Scopus | ID: covidwho-2286073

ABSTRACT

The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich collection of metadata and structured full text papers. Since its release, CORD-19 has been downloaded over 200K times and has served as the basis of many COVID-19 text mining and discovery systems. In this article, we describe the mechanics of dataset construction, highlighting challenges and key design decisions, provide an overview of how CORD-19 has been used, and describe several shared tasks built around the dataset. We hope this resource will continue to bring together the computing community, biomedical experts, and policy makers in the search for effective treatments and management policies for COVID-19. © ACL 2020. All rights reserved.

4.
3rd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering, ICBAIE 2022 ; : 66-69, 2022.
Article in English | Scopus | ID: covidwho-2213211

ABSTRACT

Since the outbreak of COVID-19, academia has published tens of thousands of new papers. Faced with so much literature, how can fine-grained classification of COVID-19 literature be achieved to help researchers carry out their work? This is an urgent problem. This paper builds a COVID-19 text classification graph dataset and designs LC-GAT, a fine-grained classification model for COVID-19 scientific literature based on a graph attention network. The model adds attention mechanisms at the word, sentence, and graph levels, effectively retains the classification information contained in article titles and keywords, and significantly improves the performance of fine-grained classification of COVID-19 scientific literature. This paper has positive significance for the classification of COVID-19 scientific literature. © 2022 IEEE.

5.
11th International Symposium on Information and Communication Technology, SoICT 2022 ; : 216-222, 2022.
Article in English | Scopus | ID: covidwho-2194132

ABSTRACT

This paper introduces a novel clustering-based framework for COVID-19 ontology construction using Pubmed LitCovid scientific research articles data. Our study uses a semantic approach with hierarchical clustering to construct a more effective COVID-19 documents ontology with medical labeling and search. We believe this study may initiate a future development for an advanced COVID-19 domain-specific ontology. The significant contribution from this research addresses solving the limitations in manual classification tasks of the everyday fast-increasing number of scientific papers and the overloading of their unclassified knowledge. With this research, our provision will help scholars with a better search mechanism to retrieve highly relevant expert information about their favorite topics in the COVID-19-related literature. To our best knowledge, this approach is the first successful attempt to apply auto clustering with labeling and search on the COVID-19 research papers. Moreover, in text processing, we propose a systematical evaluation without dependence on standard data collection to evaluate our methodology. © 2022 ACM.

6.
2nd International Conference on Advanced Research in Technologies, Information, Innovation and Sustainability, ARTIIS 2022 ; 1675 CCIS:524-534, 2022.
Article in English | Scopus | ID: covidwho-2173759

ABSTRACT

SARS-CoV-2 has brought many challenges to the world, affecting society, the economy, and health habits, even for those who have not experienced the illness itself. Although it has changed lifestyles across the world, the effects of COVID-19 still need to be analyzed and understood nation by nation, and analyzing a large amount of data is a process in itself. This document details the analysis of data collected in México by the Secretary of Health. The data was analyzed using statistics and the classification methods K-Means, C&R Tree and TwoStep Cluster, on both processed and unprocessed data, with the main emphasis on K-means. The purpose of the study is to detect which factors have the highest impact on a person getting sick and succumbing to the effects of the disease. The study found that in México the risk is highest at the age of 57, and that those at the highest risk of mortality are people with hypertension and obesity; those who present both at the age of 57 have a 19.37% risk of death. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

7.
2nd International Conference on Advanced Research in Technologies, Information, Innovation and Sustainability, ARTIIS 2022 ; 1676 CCIS:166-176, 2022.
Article in English | Scopus | ID: covidwho-2173753

ABSTRACT

Smart devices for households now offer various functions that help with daily chores, such as maintaining a full pantry and generating errand lists, but these devices are expensive on the market. For this reason, the creation of a low-cost smart device is proposed. This document presents an analysis, using data mining with TensorFlow, of the purchases generated by users. The current COVID-19 pandemic has led to an increase in online shopping, and this analysis is intended to support online errand shopping by visualizing classification and prediction in the comparisons of users. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

8.
2022 Conference and Labs of the Evaluation Forum, CLEF 2022 ; 3180:656-659, 2022.
Article in English | Scopus | ID: covidwho-2012489

ABSTRACT

This paper is an overview of the approach taken by team AI Rational in CheckThat! 2022 for Task 1 in English, Bulgarian, Dutch and Turkish. Task 1 is about classifying COVID-19 tweets and has four subtasks: 1A Check-worthiness; 1B Verifiable factual claims detection; 1C Harmful tweet detection; 1D Attention-worthy tweet detection. This document focuses on the experiments done for 1A English, where the team got first place out of 13 teams; however, the same techniques were applied to the other languages and subtasks. This document shows our data preprocessing and data augmentation, as well as the use of the transformer models BERT, DistilBERT and RoBERTa for text classification and how we fine-tuned them for best results. © 2022 Copyright for this paper by its authors.

9.
2nd Joint Conference of the Information Retrieval Communities in Europe, CIRCLE 2022 ; 3178, 2022.
Article in English | Scopus | ID: covidwho-2011458

ABSTRACT

The evaluation of information retrieval systems is performed using test collections. The classical Cranfield evaluation paradigm is defined on one fixed corpus of documents and topics. Following this paradigm, several systems can only be compared over the same test collections (documents, topics, assessments). In this work, we explore in a systematic way the impact of similarity of test collections on the comparability of the experiments: characterizing the minimal changes between the collections upon which the performance of IR system evaluated can be compared. To do that, we create pair instances of sub-test collections from one reference collection with controlled overlapping elements, and we compare the Ranking of Systems (RoS) of a defined list of IR systems. We can then compute the probability that the RoS are the same across the sub-test collections. We experiment with our framework proposed on the TREC-COVID collections, and two of our findings show that: a) the ranking of systems, according to the MaP, is very stable even for overlaps smaller than 10% for documents, relevance assessments and positive relevance assessments sub-collections, and b) stability is not ensured for MaP, Rprec, Bpref and ndcg evaluation measures even when considering large overlap for the topics. © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
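The comparison of Rankings of Systems (RoS) across pairs of sub-test collections can be sketched as follows. The abstract does not specify how the probability is estimated, so this sketch assumes the simplest reading: the empirical fraction of sub-collection pairs on which the two rankings are identical. The names `ranking_of_systems` and `same_ranking_probability`, and the tie-breaking by system name, are hypothetical choices for illustration.

```python
def ranking_of_systems(scores):
    """Order system names from best to worst score; ties are broken by name.

    `scores` maps a system name to its evaluation score (for example an
    average precision value) on one sub-test collection.
    """
    return sorted(scores, key=lambda name: (-scores[name], name))


def same_ranking_probability(pairs):
    """Empirical fraction of sub-collection pairs whose RoS are identical.

    `pairs` is a list of (scores_a, scores_b) tuples, one per pair of
    sub-test collections drawn from the reference collection.
    """
    if not pairs:
        raise ValueError("at least one pair of sub-collections is required")
    matches = sum(
        ranking_of_systems(a) == ranking_of_systems(b) for a, b in pairs
    )
    return matches / len(pairs)
```

A stricter or looser agreement criterion, such as a rank correlation threshold, could replace the exact-equality check without changing the overall structure.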

10.
21st ACM Interaction Design and Children Conference, IDC 2022 ; : 696-699, 2022.
Article in English | Scopus | ID: covidwho-1962393

ABSTRACT

The role that technology plays in supporting children at school and at home is more prominent than ever before due to the global COVID-19 pandemic. This has prompted us to focus the 6th International and Interdisciplinary Perspectives on Children & Recommender and Information Retrieval Systems (KidRec) workshop on what the lasting changes will be to the design and development of child information retrieval systems. After two years, are information retrieval systems used more in and out of the classroom? Are they more interactive, more or less personalized? What is the impact on the research and business community? Are there long-term and unexpected changes on the design, ethics, and algorithms? The primary goal of our workshop continues to be to build community by bringing together researchers, practitioners, and other stakeholders from various backgrounds and disciplines to understand and advance information retrieval systems for children. © 2022 Owner/Author.

11.
2021 IEEE MIT Undergraduate Research Technology Conference, URTC 2021 ; 2021.
Article in English | Scopus | ID: covidwho-1788798

ABSTRACT

The COVID-19 pandemic has led to a multiplicity of research publications related to various aspects of coronavirus. Research topics range from COVID-19 transmission mechanisms to the public health response of various countries, and the publications need to be categorized for easier and more efficient access to resources. This paper explores various machine learning-based document classification techniques to categorize COVID-19 related literature. We integrate a novel terminology dictionary with machine learning models to study the dictionary's impact on the effectiveness of various classification techniques. We report a slight boost to F1 scores as a result of our modifications. © 2021 IEEE.

12.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 ; : 654-663, 2021.
Article in English | Scopus | ID: covidwho-1679075

ABSTRACT

In the Chinese medical insurance industry, the assessor's role is essential and requires significant effort to converse with the claimant. This is a highly professional job that involves many parts, such as identifying personal information, collecting related evidence, and making a final insurance report. Due to the coronavirus (COVID-19) pandemic, the previously offline insurance assessment has to be conducted online. However, for junior assessors, who often lack practical experience, it is not easy to quickly handle such a complex online procedure, yet this is important as the insurance company needs to decide how much compensation the claimant should receive based on the assessor's feedback. In order to promote assessors' work efficiency and speed up the overall procedure, in this paper, we propose a dialogue-based information extraction system that integrates advanced NLP technologies for medical insurance assessment. With the assistance of our system, the average time cost of the procedure is reduced from 55 minutes to 35 minutes, and the total human resources cost is reduced by 30% compared with the previous offline procedure. To date, the system has served thousands of online claim cases. © 2021 Association for Computational Linguistics

13.
7th International Conference on Electrical, Electronics and Information Engineering, ICEEIE 2021 ; 2021.
Article in English | Scopus | ID: covidwho-1672730

ABSTRACT

The current pandemic has spread everywhere, and the economic, educational, and social sectors have been forced to adjust under pressure. People therefore need information about efforts to prevent the spread. A search engine is a program used as a tool to find information on the internet, and it is one of the topics in the field of Information Retrieval. Such a system searches documents that are unstructured in nature and can thus serve information needs over a large set of documents (on a local computer server or the internet). The vector space model is one of the many models in Information Retrieval; it is used to obtain the distance and direction between keywords and documents by representing them as vectors. The results of ranking using cosine similarity, with a dataset of 90 articles about COVID-19 and 4 keywords, are tested with precision, recall, and accuracy calculations. Precision ranges from 60% to 73%, recall for each of the keywords ranges from 81% to 100%, and accuracy ranges from 85% to 89%. The results of these experiments indicate that information retrieval with the vector space model is effective, with good and stable performance. © 2021 IEEE.
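The vector space model with cosine-similarity ranking described in this abstract can be sketched in a few lines. The abstract does not state the term weighting or tokenization used, so this sketch assumes raw term frequencies over whitespace tokens. The toy documents and the function names `tf_vector`, `cosine_similarity`, `rank` and `precision_recall` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_vector(text):
    """Represent text as a term-frequency vector (assumption: raw counts)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return indices of documents with non-zero similarity, best first."""
    q = tf_vector(query)
    scored = [
        (cosine_similarity(q, tf_vector(d)), i) for i, d in enumerate(documents)
    ]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ordered if score > 0]


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against relevance judgments."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy corpus, invented for illustration only.
docs = [
    "covid vaccine trial results",
    "vaccine distribution policy",
    "football match results",
]
ranking = rank("covid vaccine", docs)
```

Here `ranking` is `[0, 1]`: the first document shares both query terms, the second shares one, and the third shares none and is dropped. If only document 0 is judged relevant, `precision_recall(ranking, [0])` gives a precision of 0.5 and a recall of 1.0.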
